Growth Algorithms for Lattice Heteropolymers at Low Temperatures 
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Two improved versions of the pruned-enriched-Rosenbluth method (PERM) are proposed and 
tested on simple models of lattice heteropolymers. Both are found to outperform not only the 
previous version of PERM, but also all other stochastic algorithms which have been employed on 
this problem, except for the core directed chain growth method (CG) of Beutler & Dill. In nearly 
all test cases they are faster in finding low-energy states, and in many cases they found new lowest 
energy states missed in previous papers. The CG method is superior to our method in some cases, 
but less efficient in others. On the other hand, the CG method uses heavily heuristics based on 
presumptions about the hydrophobic core and does not give thermodynamic properties, while the 
present method is a fully blind general purpose algorithm giving correct Boltzmann-Gibbs weights, 
and can be applied in principle to any stochastic sampling problem. 



I. INTRODUCTION 

Lattice polymers have been studied intensively to un- 
derstand phenomena like the globule-coil transition of 
polymers, protein folding, etc. Protein folding (or, more 
precisely, protein fold prediction), one of the central 
problems of computational biology, refers to the de- 
termination of the ground state of protein molecules 
- which grosso modo is also its native state - from 
their amino acid sequence. Due to rapid advances in 
DNA analysis the number of known sequences has in- 
creased enormously, but progress in understanding their 
3-dimensional structure and their functions has lagged 
behind owing to the difhculty of solving the folding prob- 
lem. 

Simplifying the description of a protein by replacing 
each amino acid by a simple point particle on a site of a 
regular lattice implies of course a great reduction of com- 
plexity, and one might wonder how much one can learn by 
this for real proteins. But even if this simplification is too 
strong, searching for lowest energy states of such mod- 
els represents a paradigmatic example of combinatorial 
optimization. This will indeed be our main motivation: 
Finding algorithms that explore efficiently the low-energy 
states of a complicated energy landscape with many local 
minima. In addition to finding the ground state we want 
these algorithms also to sample excited states correctly, 
so that they provide a complete thermodynamic descrip- 
tion - though we shall restrict ourselves in this paper to 
presenting results on ground states only. 

A popular model used in these studies is the so-called 
HP model 0, Q where only two types of monomers, H 
(hydrophobic) and P (polar) ones, are considered. Hy- 
drophobic monomers tend to avoid water which they can 
only by mutually attracting themselves. The polymer 
is modeled as a self-avoiding chain on a regular (square 
or simple cubic) lattice with repulsive or attractive in- 
teractions between neighboring non-bonded monomers. 
Although also other interaction parameters have been 
used in the literature, almost all examples treated in 
this paper use energies ehh = —1, ^hp = ^pp — 0. 
The only other model studied here has also two types 



of monomers, for simplicity also called H and P (al- 
though they have identical hydrophobicities), but with 
eHH = epp = "1, fffp — (^]. Chain lengths consid- 
ered in the literature typically are between TV = 30 and 
N = 100. Shorter chains do not present any problem, 
longer ones are too difhcult. 

A wide variety of computational strategies have been 
employed to simulate and analyze these models, includ- 



ing conventional (Metropolis)Monte Carlo schemes with 
various types of moves [3, 0, > chain CTowth algorit 
without 'j| and with re-sampling 0,13 (see also 



^wth algorithms 

__ - ^ , _ 1), 

genetic algorithms [T^, 13 , parallel tempering [13 , and 
generalizations thereof [ijj uM i 'evolutionary Monte 
Carlo' algorithm [To|. and others 0]. In addition, Yue 
and Dill uM ^^^^ devised an exact branch-and-bound 
algorithm specific for HP sequences on cubic lattices 
which gives all low energy states by exact enumeration, 
and typically works for A^^70 — 80. If the chain is too 
long, it does not give wrong output but no output at all. 

It is the purpose of the present letter to present 
two new variants of the Pruned-Enriched Rosenbluth 
Method (PERM) 19] and to apply them to lattice pro- 
teins. PERM is a biased chain growth algorithm with 
re-sampling ( "population control" ) with depth- first im- 
plementation. It has a certain resemblance to genetic al- 
gorithms, except that the latter are usually implemented 
breadth-first and do not allow to obtain correct Gibbs- 
Boltzmann statistics. 

The original version of PERM was used for lattice pro- 
tein folding in U Q and did extremely well. With one 
exception, it could find all known lowest energy config- 
urations for all sequences tested in 0, and found a 
number of new lowest energy states. The one case where 
it could not find the ground state in an unbiased and 
blind search was a 64-mer designed in (see Fig. 1), 
but this is not surprising: Any chain growth algorithm 
should have problems in finding this configuration, since 
it has to grow a long arc which at first seems very unnat- 
ural and which is stabilized only much later. Indeed, at 
that time no other Monte Carlo method had been able to 
find this state either. But a very efficient algorithm, the 
Core-directed Growth method (CG) was overlooked in 
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FIG. 1: Left side: ground state configuration of a A'' = 64 
chain in 2D from Other states with the same energy 

differ in the detailed folding of the tails in the interior, but 
have identical outer shapes. Right side: when about 3/4 of 
the chain is grown, one has to pass through a very unstable 
configuration which is stabilized only later, when the core is 
finished. 



Thus PERM was not tested on the most difRcult 
example known at that time, a 88-mer forming a (3/a- 
barrel whose ground state energy was known exactly. In 
the meantime, also two other improved Monte Carlo al- 
gorithms were published All this motivated us 
to take up the problem again. 



II. THE ALGORITHM 

PERM is built on the old idea of Rosenbluth and 
Rosenbluth (RR) j20| to use a biased growth algorithm 
for polymers, where the bias is corrected by means of 
giving a weight to each sample configuration. While the 
chain grows by adding monomers, this weight (which also 
includes the Boltzmann weight if the system is thermal) 
will fluctuate. PERM suppresses these fluctuations by 
pruning configurations with too low weight, and by "en- 
riching" the sample with copies of high-weight config- 
urations 01 . These copies are made while the chain 
is growing, and continue to grow independently of each 
other. PERM has been applied successfully to a wide 
class of problems, including e.g. the O transition in ho- 
moDolvmers Il9l , trapping of random walkers on absorb- 
ing lattices |24[. and stretching collapsed polymers in a 
poor solvent |25| . It can be viewed as a special real- 
ization of a "go with the winners" strategy .21] which 
indeed dates back to the beginning of the Monte Carlo 
simulation era, when it was called "Russian roulette and 
splitting" T2\. Among statisticians, this approach is also 
known as seq uential importance sampling (SIS) with re- 
sampling [23. 

Pruning and enrichment were done in 0, 0, llfl| | by 
choosing thresholds and depending on the esti- 
mate of the partition sums of n-monomer chains. These 



thresholds are continuously updated as the simulation 
progresses. If the current weight Wn of an n-monomer 
chain is less than , a random number r is chosen uni- 
formly in [0, 1]. If r < 1/2, the chain is discarded, other- 
wise it is kept and its weight is doubled. Thus low-weight 
chains are pruned with probability 1/2. Many alterna- 
tives to this simple choice are discussed in 23], but we 
found that more sophisticated strategies had little influ- 
ence on the efficiency, and thus we kept the above in the 
present work. The determination of and will 
be discussed later. In principle we could use the same as 
in 0, 13 J but we simplified it since the new variants are 
more robust, and some of the tricks employed in 0, Q 
are not needed. 

On the contrary, we found that different strategies in 
biasing and, most of all, in enrichment had a big effect, 
and it is here the present variants differ from those in 

0, Q. There, high -weight configurations were simply 
cloned (with the number of clones determined from the 
ratio of the actual weight to W^), and the weight was 
uniformly shared between the clones. For relatively high 
temperatures this is very efficient [T9l |. since each clone 
has so many possibilities to continue that different clones 
very quickly become independent from each other. This 
is no longer the case for very low temperatures. There 
we found that clones often evolved in the same direc- 
tion, since one continuation has a much higher Boltz- 
mann weight than all others. Thus, cloning is no longer 
efficient in creating configurational diversity, which was 
the main reason why it was introduced. 

The main modification made in the present paper is 
thus that we no longer make identical clones. Rather, 
when we have a configuration with n — 1 monomers, we 
first estimate a predicted weight W^'^°'^ for the next step, 
and we count the number fefi-oe of free sites where the n- 
th monomer can be placed. If W^""^ > W> and fca-ee > 

1, we choose 2 < fc < kf^cc different sites among the 
free ones and continue with k configurations which are 
forced to be different. Thus we avoid the loss of diversity 
which limited the success of old PERM. We tried several 
strategies for selecting k which all gave similar results. 
Typically, we used k = min{fcft.ec, \WP"''^/W>^}. 

When selecting a fc-tuple A = {ai, . . .at} of mutually 
different continuations aj with probability , the corre- 
sponding weights Wn,ai ■ ■ ■ , Wn,ak ^'^^ (sGC Appendix) 



Here, the importance 



Wn-iqajkhc 



(1) 



(2) 



of choice aj is the Boltzmann-Gibbs factor associated 
with the energy En,aj of the newly placed monomer in 
the potential created by all previous monomers, and the 
terms in the denominator of Eq. arise from correcting 
bias and normalization. 

For the choice of continuations among the /cfrco candi- 
dates, we used two different strategies: 
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(1) In the first, called nPERMss for "new PERM with 
simple sampling" , we chose them randomly and uni- 
formly, with the only restriction that they are mutually 
different. Accordingly, WP'""^ = Wn-ikf.^e and 



(3) 



This has the advantage of simplicity, but it might at first 
appear to be inefhcient. A priori, we would expect that 
some bias in favour of continuations with high Boltzmann 
weights or against continuations which run into dead ends 
might be necessary for efficiency. 

(2) In the second, called nPERMis for "new PERM 
with importance sampling", we did just that. For each 
possible placement a G [1, kf^cc] of the n-th monomer we 
calculated its energy En. a and its number k^^^^ of free 
neighbours, and used modified importances defined by 



l/2)exp(-/3S„,„) 



(4) 



to choose among them. The predicted weight is now 
^prod ^ Wn-iY.a^a- ^hc replacement of qa by qa 
is made since we anticipate that continuations with less 
free neighbours will contribute less on the long run than 
continuations with more free neighbours. This is similar 
to "Markovian anticipation" ^] within the framework of 
old PERM, where a bias different from the short-sighted 
optimal importance sampling was found to be preferable. 

The actual choice was made such that, for a given 
k (remember that k was already fixed by the ratio 
W^'^"'^ /Wn), the variance of the weights Wn is mini- 
mal. For k = I this is standard importance sampling, 
Pa = 'Za/ X^a' variance of W„ for fixed 
Wn-i would be zero if we had not replaced qa by qa'. 

Wn,a = Wn-iqa/Pa = Wn-iqa / qaJ2a' 9a' ■ For fc > 1, 

the probability to select a tuple A = {ai, ... a^} is found 
to be 



PA 



(5) 



The corresponding weights are determined according to 
Eq. IpQ. The variance of the weight increase Wn.a/Wn-i, 
summed over all k continuations within the tuple, would 
again be zero if qa were not replaced by g^. 

nPERMis is more time consuming than nPERMss, but 
it should also be more efficient. While Eq. (O with ija re- 
placed by qa would be optimal if the chain growth were 
a Markov process, it is not guaranteed to be so in the 
actual (non-Markovian) situation. We tried some alter- 
natives for PA, but none gave a clear improvement. 

A noteworthy feature of both nPERMss and nPERMis 
is that they cross over to complete enumeration when 
and Wn tend to zero. In this limit, all possible branches 
arc followed and none is pruned as long as its weight is 
not strictly zero. In contrast to this, old PERM would 
have made exponentially many copies of the same con- 
figuration. This suggests already that we can be more 
lenient in choosing and ■ For the first configu- 
ration hitting length n we used = and = oo. 



i.e. we neither pruned nor branched. For the follow- 
ing configurations we used = C Zn/ ZQ{cn/ and 
Wn = ^■2Wn ■ Here, c„ is the total number of config- 
urations of length n already created during the run, Z„ 
is the partition sum estimated from these configurations, 
and C is some positive number < 1. The following re- 
sults were all obtained with C = 1, though substantial 
speed-ups (up to a factor 2) could be obtained by choos- 
ing C much smaller, typically as small as 10~^^ to lO"^**. 
The latter is easily understandable: with such small C, 
the algorithm performs essentially exact enumeration for 
short chains, giving thus maximal diversity, and becomes 
stochastic only later when following all possible configu- 
rations would become unfeasible. We do not quote the 
optimal results since they are obtained only for narrow 
ranges of C which depend on the specific amino acid se- 
quence, and finding them in each case would require an 
extensive search. 

Since both nPERMss and nPERMis turned out to be 
much more efficient and robust than old PERM, we did 
not use special tricks employed in like growing chains 
from the middle rather than one of the ends, or forbidding 
contacts between polar monomers. 

In the following, when we quote numbers of ground 
state hits or CPU times between such hits, these are al- 
ways independent hits. In PERM we work at a fixed tem- 
perature (no anneafing), and successive "tours" are 
independent except for the thresholds Wn '^ which use 
partially the same partition sum estimates. The actual 
numbers of (dependent) hits are much larger. 

For both versions, results are less sensitive to the pre- 
cise choice of temperature than they were for old PERM. 
As a rule, optimal results were obtained at somewhat 
lower temperatures, but in general all temperatures in 
the range 0.25 < T < 0.35 gave good results for ground 
state search. 



III. RESULTS 

(a) We first tested the ten 48-mers from 0]. As with 
old PERM, we could reach lowest energy states for all of 
them, but within much shorter CPU times. As seen from 
Table □ nPERMis did slightly better than nPERMss, 
and both were about one order of magnitude faster than 
old PERM. For all 10 chains we used the same tem- 
perature, exp(l/T) — 18, although we could have op- 
timized CPU times by using different temperatures for 
each chain. In the following we quote in general only re- 
sults for nPERMis, but results for nPERMss were nearly 
as good. 

The CPU times for nPERMis in Table dare typically 
one order of magnitude smaller than those in ^] , except 
for sequence #9 whose lowest energy was not hit in 
Since in a SPARC 1 machine was used which is slower 
by a factor w 10 than the 167 MHz Sun ULTRA I used 
here, this means that our algorithms have comparable 
speeds. 



TABLE I: Performances for the 3-d binary (HP-) sequences 



from 4] . 


sequence 


-£'min 


perm' 


nPERMss'' 


nPERMis'' 


nr. 










1 


32 


6.9 


0.66 


0.63 


2 


34 


40.5 


4.79 


3.89 


3 


34 


100.2 


3.94 


1.99 


4 


33 


284.0 


19.51 


13.45 


5 


32 


74.7 


6.88 


5.08 


6 


32 


59.2 


9.48 


6.60 


7 


32 


144.7 


7.65 


5.37 


8 


31 


26.6 


2.92 


2.17 


9 


34 


1420.0 


378.64 


41.41 


10 


33 


18.3 


0.89 


0.47 



"Ground state energies 

''CPU times (minutes) per independent ground state hit, on 167 
MHz Sun ULTRA I work station; from Ref. |3| 
'^CPU times, same machine 
"^CPU times, same machine 



(b) Next we studied the two 2-d HP-sequences of 
length N = 100 of Ref. ,1^]. They were originally thought 
to have ground states fitting into a 10 x 10 square with en- 
ergies -44 and -46 |^ , but in ^ configurations fitting into 
this square were found with lower energies, and moreover 
it was found that the configurations with lowest energies 
{E = —47 resp. E = —49) did not fit into this square. 
In the present work we studied only configurations of the 
latter type. 

For the second of these sequences, new lowest energy 
configurations with E = —50 were found later in |14j . 
within 50 h CPU time on a 500 MHz DEC 21164A. We 
now hit this energy 7 times, with an average CPU time 
of 5.8 h on a 667 MHz DEC 21264 between any two hits. 

For the first sequence of Ref. we now hit several 
hundred times states with E = —48, with ca. 2.6 min 
CPU time between successive hits. One of these config- 
urations is shown in Fig. 2. 

(c) Several 2-d HP-sequences were introduced in 
where the authors tried to fold them using a genetic al- 
gorithm. Except for the shortest chains they were not 
successful, but putative ground states for all of them 
were found in 0, ITsL IT^ . But for the longest of these 
chains (with N = 64, see Fig. 1), the ground state en- 
ergy -Emin = — 42 was found in [g only by means of spe- 
cial tricks which amount to non-blind search. With blind 
search, the lowest energy reached by PERM was -39. We 
should stress that PERM as used in ;8] was blind for all 
cases except this 64-mer, in contrast to wrong statements 
made in [lO| . 

We now found putative ground states for all chains of 
[m with blind search. For the 64-mer the average CPU 
time per hit was ca. 30h on the DEC 21264, which seems 
to be roughly comparable to the CPU times needed in 
[TsL IT^ , but considerably slower than |§| . As we already 
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P = 



FIG. 2: Typical configuration with E = —48 of the first se- 
quence of Ref. • 



said in the introduction, this sequence is particularly dif- 
ficult for any growth algorithm, and the fact that we now 
found it easily is particularly noteworthy. 

On the other hand, nPERMis was much faster than 
i for the sequence with iV = 60 of [H. It needed w 10 
seconds on the DEC 21264 to hit E„,in = -36, and w 0.1 
second to hit = — 35. In contrast, E = —36 was never 
hit in 0, while it took 97 minutes to hit E = —35. 

(d) A 85-mer 2-d HP sequence was given in ||2^ , where 
it was claimed to have E'min = — 52. Using a genetic al- 
gorithm, the authors could find only conformations with 
E > —47. In Ref. T^, using a newly developed evolution- 
ary Monte Carlo (EMC) method, the authors found the 
putative ground state when assuming large parts of its 
known structure as constraints. This amounts of course 
to non-blind search. Without these constraints, the pu- 
tative ground state was not hit in [Tol | either, although 
the authors claimed their algorithm to be more efficient 
than all previous ones. 

Both with nPERMss and with nPERMis we easily 
found states with E = —52, but we also found many con- 
formations with E = -53. For nPERMis at exp(l/T) = 
90 it took ca. 10 min CPU time between successive hits 
on the Sun ULTRA 1. One of those conformations is 
shown in Fig. O 

(e) As two easy cases we studied the two longest se- 
quences from Il2l. since we can compare there with CPU 
times given in |l2l | for three versions of a supposedly very 
efficient genetic algorithm. These 2-d HP sequences with 
lengths = 33 and 48 have ground state energies -14 
and -23, respectively. In T^], the most efficient version 
needed on average ~ 45min CPU (on an unspecified ma- 
chine) to reach a ground state of the 33-mer. For the 
48-mer only energy -22 could be reached, within « 2.5h 
per hit. Using exp(l/T) = 40, it took the Sun ULTRA 
1 just 0.4 sec to hit one ground state of the 33-mer, 7 
sec to hit E = —22 for the 48-mer, and 16 min to hit a 
ground state of the 48-mer. Thus the present algorithm 
is roughly 1000 times faster than that of [T^ . 

(f) Four 3D HP sequences with N = 58, 103, 124, and 
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FIG. 3: New putative ground state configuration with E 
-53 of the 2-d A/" = 85 chain taken from jia..28,l . 



FIG. 5: Configuration with E = -54 of the N 
sequence modeling cytochrome c from Ref. |30 



103 HP 





FIG. 4: Configuration with E = -44 of the iV = 58 HP 
sequence modeling protein BPTI from Ref. [lMl29|. 



FIG. 6: Configuration with E = -71 of the N = 124 HP 
sequence modeling ribonuclease A from Ref. [iM l3n| . 



136 were proposed in |29|, |30j as models for actual pro- 
teins or protein fragments. Low energy states for these 
sequences were searched in using a newly developed 
and supposedly very efficient algorithm. The energies 
reached in |3| were E = —42, —49, —58 and —65, respec- 
tively. With nPERMis, we now found lower energy states 
after only few minutes CPU time, for all four chains. For 
the longer ones, the true ground state energies are in- 
deed much lower than those found in 0, see Table ITU 
Examples of the putative ground state configurations are 
shown in Figs. ^ to Fig. [T] 

Note the very low temperatures needed to fold the very 
longest chains in an optimal time. If we would be in- 
terested in excited states, higher temperatures would be 
better. For instance, to find E = —66 for the 136-nier 
(which is one unit below the lowest energy reached in 
[III), it took just 2.7 seconds/hit on the DEC 21264 when 
using exp(l/r) = 40. 

(g) Several 3-d HP sequences were studied in [Tsf. 
where also their exact ground state energies were calcu- 
lated using the 'constrained hydrophobic core construc- 
tion' (CHCC) which is essentially an exact enumeration 



method tailored specifically to HP sequences on the cu- 
bic lattice. According to [Tg, CHCC can be used to find 
all exact ground state configurations for chains of length 
iV R:^ 70 to 88, depending on their degeneracies. 

The longest chains given explicitly in together with 
their native configurations are a four helix bundle with 

= 64 and -Emin = —56, and a chain with N — 67 fold- 
ing into a configuration resembling an a/P barrel with 
Emin — —56, too. Both have low degeneracy. 



TABLE II: Performance for the 3-d HP sequences from "SC 



iV 


rp a 


P 6 
^min 


exp(l/r) 


CPU time " 


58 


-42 


-44 


30 


0.19 


103 


-49 


-54 


60 


3.12 


124 


-58 


-71 


90 


12.3 


136 


-65 


-80 


120 


110. 



"Lowest energies found in Ref. Il6l 
''Present work, using nPERMis 

■^CPU times (flours) per independent fowest state hit, on 667 MHz 
DEC ALPHA 21264 
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TABLE III: Newly found lowest energy states for binary sequences with interactions e = {eHH, ^hp, ^pp) 









old iSmin 


N d 




Sequence 


new -Emin Ref. 



100 2 -(1,0,0) PeHPH2P5H3PH5PH2P2{P2H2)2PH5PH^oPH2PH7PiiH^P2HPHsP6HPH2 -47 [7] 

-48 

85 2 -(1,0,0) HiPiH^2P(,Hi2PiHi2P3Hi2P-iHP2H2P2H2P2HPH -52 [JJ 

-53 

58 3 -(1,0,0) PHPHzPH3P2H2PHPH2PHsPHPHPH2P2HsP2HPHPiHP2HP2H2P2HP2H -42 [16] 

-44 

103 3 -(1,0,0) P2H2P5H2P2H2PHP2HPTHP3H2PH2P&HP2HPHP2HPsHzPiH2PH2PsH2P4. -49 [16] 

H4PHP8HSP2HP2 -54 
124 3 -(1,0,0) PiHzPHPiHPsH2PiH2P2H2PiHP4,HP2HP2H2PsH2PHPH3P4,H3P(.H2P2HP2 -58 [1^ 

HPHP2HP7HP2H3P4HP3H5P4H2PHPHPHPH -71 
136 3 -(1,0,0) HP5HPiHPH2PH2PiHPH3PiHPHPHAPiiHP2HP3HPH2P3H2P2HP2HPHPHPsH -65 fl6] 

PiHeP3H2P2HaP3H2PH5P9HP4HPHPi -80 





FIG. 7: Configuration with £ = -80 of the = 136 HP 
sequence modeling a staphylococcal nuclease fragment, from 
Ref. [llE^l- 



FIG. 8: Ground state configuration {E = -56) of the iV = 67 
HP sequence given in fl^ . It forms a structure resembling an 
a/P barrel. When starting at monomer #67 (/3 sheet end), 
nPERMis could find it easily, but not when starting from 
monomer #1. 



Finding ground states for the 64-mer was no problem 
for nPERMis. For exp(l/r) = 50, the DEC ALPHA 
21264 machine needed on average 26.8 min CPU time to 
hit one of them. Things are a bit more interesting for 
the 67-mer. One of its ground states is shown in Fig. |S1 
Assume we want to let this grow, starting from the /? 
sheet end (monomer #67). Then we sec that we always 
can form immediately stabilizing H-H bonds, and that 
we would be never seriously misled if we would place 
monomers greedily, at positions where they have low en- 
ergies. Indeed, starting from this end we had no problems 
with nPERMis: It took on average 67 min to hit a native 
state on the DEC ALPHA 21264. 

On the other hand, when starting with monomer #1, 
we were unsuccessful and the lowest energy reached was 
E = —53, even after much longer CPU times. This is 
easily understood from Fig.|Hl starting from this end we 
have to go repeatedly into directions which seem very 
unnatural at first sight, and which get stabilized much 



later. 

Notice that the difference between the two growth di- 
rections is not that there is a folding nucleus when start- 
ing from #67, and no folding nucleus when starting from 
#1. After the first quarter is built up, both give the same 
a/P pair. Building this first quarter is no problem even 
when starting from #1, at least when we use C ^ 1 (in 
which case it is built essentially by complete enumera- 
tion). Thus the existence of a nucleus in the traditional 
sense is not sufficient. Instead it is crucial that further 
growth from this nucleus does not lead into false minima 
of the energy landscape. 

(h) Next wc studied the two-species 80-mer with in- 
teractions (-1,0^-1) that was introduced in Q. It was 
constructed in Q such as to fold into a four helix bun- 
dle with E = —95, but two configurations with E = —98 
were found in 7] which essentially are /3 sheet dominated. 
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FIG. 9: Ground state configuration {E = -72) of the Af = 88 
HP sequence given in 0. It also forms a structure resembUng 
an a/ P barrel, with the core (the 4 (3 strings) built from the 
central part of the chain. Without this core being already 
present, folding from neither end is easy. 



These configurations were hit on average once every 80 
hours on a 167 MHz Sun ULTRA 1. Later they were also 
found in , with similar CPU time as far as we can tell. 
With nPERMis we needed only 5.3 hours / hit, on the 
same Sun ULTRA 1 (and for 8 < exp(l/T) < 12). 

(i) Finally we also studied the 3-d HP sequence of 
length 88 given in j^. As shown there, it folds into 
an irregular /J/a-barrel with i?niin = —72. This is the 
only chain whose ground state we could not find by our 
method, instead we only reached E = —69. This is in 
contrast to the CG method which could find the lowest 
energy easily . The difficulties of PERM with this se- 
quence are easily understood by looking at one of the 
ground states, see Fig. El The nucleus of the hydropho- 
bic core is formed by amino acids #36-53. Before its 
formation, a growth algorithm starting at either end has 
to form very unstable and seemingly unnatural struc- 
tures which are stabilized only by this nucleus, a situa- 
tion similar to that in Fig. 1. In order to fold also this 
chain, we would have either to start from the middle of 
the chain (as done in Q for some sequences) or use some 
other heuristics which help formation of the hydrophobic 
core. Since we wanted our algorithm to be as general and 
"blind" as possible, we did not incorporate such tricks. 
The CG method, in contrast, is based on constructing 
an estimate of the hydrophobic core and the hydrophilic 
shell, and letting the chain grow to fill both in an optimal 
way, using a heuristic cost function. 

Before leaving this section we should say that for all 
chains studied in this paper we found also states with 
E = i?min + 1,-Emin + 2,.... Thus uoue of the se- 
quences showed an energy gap above the (putative or 
exact) ground state. If such a gap is indeed typical for 
good folders, then none of the above sequences should be 
considered as good folders. 

A list containing all sequences for which we found new 



lowest energy configurations is given in Table UTTI 



IV. DISCUSSION 

In the present paper we presented two new versions 
of PERM which is a depth-first implementation of the 
'go-with-thc-winners' strategy (or sequential importance 
sampling with re-sampling). The main improvement is 
that we now do not make identical clones of high weight 
(partial) configurations, but we branch such that each 
continuation is forced to be different. We do not ex- 
pect this to have much influence for systems at high 
temperatures, but as we showed, it leads to substantial 
improvement at very low temperatures. The two ver- 
sions differ in using simple sampling (nPERMss) resp. 
importance sampling (nPERMis) when choosing among 
possible branches. 

Although the method could be used for a much wider 
range of applications (see jsj for applications of PERM) , 
we applied it here only to lattice heteropolymers with two 
types of monomers. These represent toy models of pro- 
teins, and we hope that our results will also foster appli- 
cations to more realistic protein models. We showed only 
results for lowest energy configurations, but we should 
stress that PERM and its new variants are not only op- 
timization algorithms. They also give information on 
the full thermodynamic behaviour. We skipped this here 
since finding ground states is the most difficult problem 
in general, and sampling excited states is easy compared 
to it. 

Comparing our results to previous work, we see that 
we found the known lowest energy states in all cases but 
one. Moreover, whenever we could compare with previ- 
ous CPU times, the comparison was favourable for our 
new algorithms, except for the CG method of Beutler 
and Dill ;9j. But we should stress that the latter is very 
specific to HP chains, uses strong heuristics regarding 
the formation of a hydrophobic core, and does not give 
correct Boltzmann weights for excited states. All that is 
not true for our method. In general nPERMis did slightly 
better than nPERMss, although the difference was much 
less than a priori expected. 

In principle, essentially the same algorithms can also 
be used for off-lattice systems. This was already true for 
the original version of PERM which performed well for 
Lennard- Jones polymers at temperatures around the Q- 
transition , but rather badly for collapsed heteropoly- 
mers at temperatures much below the Q temperature 
|33| . Work is presently in progress to see whether the new 
versions of PERM perform better, and whether they can 
be used efficiently to study protein folding with realistic 
interactions. 

Acknowledgements: We are indebted to Ralph Andrze- 
jak for useful discussions and for critically reading the 
manuscript. WN was supported by the DEC (SFB 237). 
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Appendix 

In this appendix we shall collect some basic facts about 
random sampling, when tuples of instances are selected 
instead of individual instances. The discussion will be 
very general. On the other hand, we will not deal with 
problems specific to sequential sampling, i.e. we will as- 
sume that we sample only for the choice of a single item 
(e.g. for the placement of a single monomer). 

Our central aim is thus to estimate a partition sum 



N 



(6) 



K < N, since otherwise this would amount to an exact 
summation of Z. An advantage of such a strategy should 
be that we obtain a more widely and uniformly spread 
sample. When N ^ K, this should not have a big effect, 
but in our applications both N and K are small and the 
effect is substantial. 

Thus each event consists in choosing a K-tuple 
{ii, i2, . . . , iK}, with the ij mutually different, from some 
probability distribution Pi^^i2,...,iK ■ We consider tuples 
related by permutations as identical, i.e. without loss of 
generality we can assume that «i < Z2 < . . . < ik- Each 
choice a of a tuple {«i(a), 12(0), . . . ,ii<-(a)} will lead to 
an estimate Zi<-(a). Instead of Eq.® we have now 



where the importances qi might e.g. be Boltzmann-Gibbs 
factors, and where N is assumed to be finite (the gener- 
alization to infinite N and to integrals instead of sums is 
straightforward). A conventional Monte Carlo (MC) pro- 
cedure consists in choosing 'instances' i{a), a = 1,2,... 
with probabilities Pi(a) such that each instance gives an 
unbiased estimate Zi{a) (the index "1" will be explained 
in a minute). Thus, given M such instances and letting 
M tend to infinity, we have 



Z = lim 

M^oc 



1 



M 

M E 

a=l 



Zi{a). 



One easily sees that 



Zi{a) 



Ilia) 
Pi(a) 



does the job. Indeed, 



lim 



1 M 

1 \ - gt(a) 



N 



Pi 



(7) 



(8) 



(9) 



At the same time we can also estimate the variance of 
Zi. We have 



VarZi = {Zl)-{Zif 

N / X 2 



N 2 



Up to now everything is correct for any choice of the 
probabilities pi. They get fixed e.g. by pi — 1/N (uni- 
form sampling) or by demanding VarZi to be minimal, 
under the constraint X^iP* ~ 1- This simple variational 
problem gives p°^^ oc qi which is known as importance 
sampling. For perfect importance sampling one finds fur- 
thermore that VarZi = 0. 

Let us now assume that we select each time not one 
instance but K instances, all of which are different. This 
requires of course K < N. Moreover we will assume 



Zxia) 



^i,K)Pil{o')-..iK(a) 



(10) 



since one verifies easily that {Zk{ol)) — Z. 

The variance of Zk{(x) is calculated just like that of 
Zi(a), 



iV- 1 
K - 1 



E 



Pii...iK 



ii<...<iK 



-Z^ . (11) 



Importance sampling is again obtained by minimizing it 
with respect to Pi^...iKi giving the result 



opt 



Ef=l <l^k 



(12) 



The variance of Zk vanishes again for this choice. 

On the other hand, for uniform (or "simple") sampling, 
with 



P 



(13) 



we obtain 



Var^A- 



[N - K)N^ 
K{N - 1) 



Var q . (simple sampling) 

(14) 

For K ^ 1 this is the obvious result VarZi = N^Varq, 
while for K — N it gives Var^^r = as it should. For 
general 1 < K < N the factor 1/K is trivial and results 
from the fact that each event corresponds to K instances, 
while the factor {N — K)/{N — 1) gives the non-trivial 
improvement due to the fact that only different instances 
are chosen in each event. 

Finally, when using Ea. (|10|l for sequential sampling, 
one has to attribute weights to each individual instance, 
instead of giving a weight only to the entire tuple. The 
obvious solution is 



N 



K{j^)p^,(^c)...tK{a) 



(15) 
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